ABSTRACT

Deriving the total masses of galaxy clusters from observations of the intracluster 
medium (ICM) generally requires some prior information, in addition to the assump- 
tions of hydrostatic equilibrium and spherical symmetry. Often, this information takes 
the form of particular parametrized functions used to describe the cluster gas density 
and temperature profiles. In this paper, we investigate the implicit priors on hydro- 
static masses that result from this fully parametric approach, and the implications of 
such priors for scaling relations formed from those masses. We show that the applica- 
tion of such fully parametric models of the ICM naturally imposes a prior on the slopes 
of the derived scaling relations, favoring the self-similar model, and argue that this 
prior may be influential in practice. In contrast, this bias does not exist for techniques 
which adopt an explicit prior on the form of the mass profile but describe the ICM 
non-parametrically. Constraints on the slope of the cluster mass-temperature relation 
in the literature show a separation based on the approach employed, with the results from
fully parametric ICM modeling clustering nearer the self-similar value. Given that a 
primary goal of scaling relation analyses is to test the self-similar model, the appli- 
cation of methods subject to strong, implicit priors should be avoided. Alternative 
methods and best practices are discussed. 



1 INTRODUCTION 

Scaling relations between observable properties and total 
gravitating mass are a critical ingredient for cosmologi- 
cal tests based on galaxy clusters (for a review, see Allen, 
Evrard, & Mantz 2011). For tests using the abundance, clus- 
tering and growth of clusters, scaling relations provide es- 
sential mass proxies and are fundamentally important in ac- 
counting for selection biases. Our knowledge of these rela- 
tions and the systematics that affect them currently limits 
the achievable constraints on some cosmological parameters 
(e.g. Mantz et al. 2008, 2010a,b; Vikhlinin et al. 2009a,b; 
Rozo et al. 2010; Wu et al. 2010, and references therein). 
Measurements of cluster gas mass fractions, which constrain 
the mean matter density and cosmic expansion history (e.g.
Sasaki 1996; Allen et al. 2004, 2008), can also be expressed in 
terms of a scaling relation, namely gas mass as a function of 
total mass. In addition, the scaling relations are of consider- 
able astrophysical interest, reflecting the complex response 
of the baryonic components of these systems to their overall 
gravitational potentials, environments and formation histories.
For example, departures from the self-similar form introduced
by Kaiser (1986) provide clues to non-gravitational
processes at work in clusters (e.g. Voit 2005 and references
therein).

In previous work, we have emphasized the need to model 
covariance between measured quantities in the analysis of 
cluster scaling relations (Mantz et al. 2010a,b). Here, we 
distinguish further between various contributing factors to 
such covariance: 

(i) Covariance that is intrinsic to the measurement pro- 
cess. For example, when multiple quantities are measured 
from an X-ray observation, Poisson uncertainties due to pho- 
ton counting affect these quantities in a coherent rather than 
independent way. 

(ii) Covariance due to explicit use of one measurement to 
inform another. For example, if the Compton Y signal from a 
Sunyaev-Zel'dovich observation is measured within a radius 
determined from an X-ray observation, the statistical error 
in the radius determination coherently affects the errors on 
Y and on any X-ray quantities measured within that radius. 

(iii) Covariance that is introduced by models that are fit- 
ted to the cluster data. 

The first concern above is straightforwardly addressed
by jointly fitting or measuring all quantities of interest from
the observations, and propagating the measurement errors 
using Monte Carlo sampling. The distribution of the result- 
ing samples automatically contains all the information about 
the measurement covariance. 

Unlike the first, the second and third issues result from
decisions on the part of the observer. For the second issue 
noted above, the use of Monte Carlo sampling to handle the 
error propagation can again allow the measurement covari- 
ance to be straightforwardly understood. 
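As a rough illustration of the second case, the sketch below draws Monte Carlo samples of a shared aperture radius and measures two integrated quantities within it. The error sizes and the power-law growth of each quantity with radius are toy assumptions chosen for the example, not values from any analysis discussed here. The correlation of the resulting samples is the measurement covariance induced by the shared radius.

```python
import math
import random

def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient built from the sample covariances."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys))

def monte_carlo_shared_radius(n_draws=5000, seed=1):
    """Propagate a shared aperture-radius error into two integrated quantities."""
    rng = random.Random(seed)
    y_samples, mgas_samples = [], []
    for _ in range(n_draws):
        radius = rng.gauss(1.0, 0.05)   # aperture radius with a 5% error (toy value)
        # Toy cumulative profiles (assumed): both quantities grow with the aperture.
        y_signal = radius ** 1.0 * rng.gauss(1.0, 0.03)   # independent 3% noise
        gas_mass = radius ** 1.2 * rng.gauss(1.0, 0.03)   # independent 3% noise
        y_samples.append(y_signal)
        mgas_samples.append(gas_mass)
    return y_samples, mgas_samples

y_s, m_s = monte_carlo_shared_radius()
rho = correlation(y_s, m_s)
print(f"correlation induced by the shared radius: {rho:.2f}")
```

Although the noise terms are independent, the two quantities come out strongly correlated because each draw uses the same radius. The same samples can be passed directly to a scaling relation fit that accounts for covariance.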

In this paper, we are concerned with the third issue 
identified above, in particular as it applies to mass mea- 
surements of galaxy clusters based on the assumption of hy- 
drostatic equilibrium (HSE), and scaling relations that are 
formed with such masses. We argue below that some widely 
employed procedures used to model cluster data introduce 
strong priors that can influence the resulting scaling relation 
constraints, and thus hamper our ability to perform robust 
astrophysical and cosmological measurements. Fortunately, 
there are simple, alternative approaches that do not suffer 
from these problems, which are also discussed here. 

The paper is organized as follows. Section 2 provides 
a brief introduction to galaxy cluster mass measurements, 
scaling relations and the self-similar model. In Section 3, 
we discuss the various methods for estimating hydrostatic 
masses that have been proposed, with particular emphasis 
on the priors that each method imposes on both masses and
the resulting scaling relations. In Section 4, we review and 
discuss results on the mass-temperature relation slope from 
the literature, with attention to the impact of these modeling 
priors. Our conclusions are summarized in Section 5. 



2 BACKGROUND 

2.1 Hydrostatic mass estimates and scaling 
relations 

Many galaxy cluster mass estimates in the literature are 
based on X-ray observations. 1 X-ray data provide two ob- 
servables that scale physically with total mass, namely the 
luminosity in the observed energy band and the tempera- 
ture of the X-ray emitting, hot intracluster medium (ICM). 
Under the assumption of spherical symmetry, spectral and 
surface brightness data measured in projection can be de- 
projected, yielding three-dimensional profiles of emissivity 
and temperature. These two can be combined to infer the 
ICM density profile, 2 and thus the gas mass, as well as the 
bolometric luminosity (e.g. Sarazin 1988). 

For clusters that are approximately spherical and close 
to HSE, such data can also be used to constrain the to- 
tal mass profile. Specifically, HSE implies a relationship be- 
tween the density and temperature profiles of the ICM, n(r) 
and T(r), and the total mass, 



1 While we focus on X-ray methodology and results, the central 
aspects of our discussion also apply to mass estimates based on 
other data such as the Sunyaev-Zel'dovich effect, galaxy number 
density or velocity dispersion. 

2 For the typical case of hot (kT > 3 keV), low-redshift clusters,
and luminosity measured in the soft X-ray band (e.g. 0.5-2.0 keV),
this conversion is essentially independent of temperature. 



$$ M(r) = -\frac{kT(r)\,r}{\mu m_p G}\left(\frac{d\ln n}{d\ln r} + \frac{d\ln T}{d\ln r}\right), \qquad (1) $$

where k is Boltzmann's constant, G is Newton's constant, 
and $\mu m_p$ is the mean molecular mass.
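A minimal sketch of how the hydrostatic equation can be evaluated numerically is given below. It assumes smooth, callable profiles, a mean molecular weight of 0.6 (an assumed value, not one quoted in the text), and finite-difference logarithmic slopes. The toy density and temperature profiles are hypothetical and serve only to exercise the function.

```python
import math

# Physical constants (SI). MU is an assumed value, not taken from the text.
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.6726e-27   # proton mass, kg
KEV = 1.602e-16         # 1 keV in joules
KPC = 3.0857e19         # 1 kpc in metres
M_SUN = 1.989e30        # solar mass, kg
MU = 0.6                # assumed mean molecular weight

def log_slope(profile, r, eps=1e-4):
    """Central-difference estimate of d ln(profile) / d ln r at radius r."""
    upper = math.log(profile(r * (1.0 + eps)))
    lower = math.log(profile(r * (1.0 - eps)))
    return (upper - lower) / (math.log(1.0 + eps) - math.log(1.0 - eps))

def hydrostatic_mass(density, kT_keV, r_kpc):
    """Enclosed mass in solar masses at r_kpc, following the hydrostatic equation.

    density(r): gas density in arbitrary units (only its slope matters);
    kT_keV(r): temperature in keV; r_kpc: radius in kpc.
    """
    slopes = log_slope(density, r_kpc) + log_slope(kT_keV, r_kpc)
    prefactor = kT_keV(r_kpc) * KEV * r_kpc * KPC / (G * MU * M_PROTON)
    return -prefactor * slopes / M_SUN

# Toy profiles (hypothetical numbers): a declining density and a 6 keV isothermal gas.
def toy_density(r_kpc):
    return (1.0 + (r_kpc / 150.0) ** 2) ** -1.0   # arbitrary normalization

def toy_temperature(r_kpc):
    return 6.0

mass_1mpc = hydrostatic_mass(toy_density, toy_temperature, 1000.0)
print(f"M(<1 Mpc) ~ {mass_1mpc:.2e} solar masses")
```

Only the logarithmic slopes of the two profiles and the local temperature enter the estimate, which is why the shapes assumed for the profiles near the radius of interest carry so much weight.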

Conventionally, scaling relations are formed by relating
the observables of interest to the total mass, $M_\Delta$, within a
particular radius, $r_\Delta$, jointly defined by

$$ M_\Delta = \frac{4\pi}{3}\,\Delta\,\rho_c(z)\,r_\Delta^3, \qquad (2) $$

where $\rho_c(z)$ is the critical density of the universe at the
cluster's redshift. Typical choices for $\Delta$ range from 2500 (intermediate
radius) to 200 (approximately the virial radius).
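Combining these equations, one can immediately write

$$ M_\Delta = \left[\frac{3}{4\pi\,\Delta\,\rho_c(z)}\right]^{1/2} \left[\frac{kT(r_\Delta)}{\mu m_p G}\,\mathcal{F}(r_\Delta)\right]^{3/2}, \qquad (3) $$

where

$$ \mathcal{F}(r) = -\left(\frac{d\ln n}{d\ln r} + \frac{d\ln T}{d\ln r}\right). \qquad (4) $$

Clusters forming from idealized, spherical gravitational
collapse with no additional heating or cooling are expected
to have self-similar (i.e. described by the same function of
$r/r_\Delta$ for every cluster) gas and dark matter density profiles.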
Assuming HSE, the temperature profiles of clusters will also 
have a self-similar shape (though not a common normaliza- 
tion). This case was studied by Kaiser (1986), who derived 
power-law predictions for scaling relations using masses de- 
fined by Equation 2: 3 

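$$
\begin{aligned}
M_{{\rm gas},\Delta} &\propto M_\Delta, \\
T_\Delta &\propto \left[\rho_c(z)^{1/2} M_\Delta\right]^{2/3}, \\
\rho_c(z)^{1/2}\, Y_\Delta &\propto \left[\rho_c(z)^{1/2} M_\Delta\right]^{5/3}, \\
\rho_c(z)^{-1/2}\, L_\Delta &\propto \left[\rho_c(z)^{1/2} M_\Delta\right]^{4/3}. \qquad (5)
\end{aligned}
$$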
A constant gas mass fraction (the first line) is a direct
consequence of the self-similar hypothesis, while the second
follows from Equation 3, since self-similarity implies that
$\mathcal{F}(r_\Delta)$ is a constant. Here $Y \propto \int dV\, n\,kT$ is the integrated,
intrinsic Sunyaev-Zel'dovich signal (i.e. the thermal energy
of the gas), and L refers to the bremsstrahlung luminosity
of the plasma ($L \propto \int dV\, n^2 T^{1/2}$). $T_\Delta$ may be the temperature
at radius $r_\Delta$ or some weighted average of T(r)
within $r_\Delta$, since the scalings are identical given self-similarity
[$T(r_\Delta)/T_\Delta$ is constant].

For simplicity, we will henceforth eliminate most of the
constants, setting

$$ \frac{k}{\mu m_p G} = \frac{4\pi}{3}\,\Delta\,\rho_c(z) = 1. \qquad (6) $$

In practice, the redshift dependence represented by $\rho_c(z)$
must be properly accounted for; however, it is incidental 
to the focus of this work. By eliminating these terms, we 



3 Often the factors $\rho_c(z)^{1/2}$ are written in terms of E(z), the
normalized Hubble parameter. 
effectively consider the simplified case of scaling relations at
a single redshift and fixed density contrast. 

2.2 A simple example: the isothermal $\beta$ model

As an example, we consider the isothermal $\beta$ model. This
case is deliberately simplistic (indeed, the results below are 
well known), but it serves to illustrate some features of HSE 
mass estimation using parametrized models that are relevant 
to our discussion in Section 3. 

In this model, the three-dimensional gas density and 
temperature profiles are parametrized by 



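$$
\begin{aligned}
n(r) &= n_0 \left[1 + \left(\frac{r}{r_c}\right)^2\right]^{-3\beta/2}, \\
T(r) &= T_0. \qquad (7)
\end{aligned}
$$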



The gas mass is given by 



$$ M_{\rm gas}(r) = \frac{4\pi n_0 r^3}{3}\; {}_2F_1\!\left(\frac{3}{2}, \frac{3}{2}\beta; \frac{5}{2}; -\frac{r^2}{r_c^2}\right), \qquad (8) $$

where ${}_2F_1(a, b; c; z)$ is the Gauss hypergeometric function.
Applying the hydrostatic equation, the mass profile is 

$$ M(r) = \frac{3\beta T_0\, r^3}{r_c^2 + r^2}, \qquad (9) $$

which yields a solution for the characteristic radius, 



$$ r_\Delta = \sqrt{3\beta T_0 - r_c^2}. \qquad (10) $$



From this, we can write the relationship between tempera- 
ture and mass for clusters described by this model, 



$$ T_0 = \frac{1 + (r_c/r_\Delta)^2}{3\beta}\, M_\Delta^{2/3}. \qquad (11) $$
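The closed-form results for the isothermal model can be checked numerically. The sketch below works in the dimensionless units adopted above (with the physical constants set to unity, so that the characteristic mass equals the cube of the characteristic radius) and uses hypothetical parameter values. The function names are illustrative only.

```python
import math

def beta_model_mass(r, beta, t0, r_c):
    """Hydrostatic mass of the isothermal beta model in dimensionless units."""
    return 3.0 * beta * t0 * r**3 / (r_c**2 + r**2)

def characteristic_radius(beta, t0, r_c):
    """Radius at which the enclosed mass equals r^3 in these units."""
    return math.sqrt(3.0 * beta * t0 - r_c**2)

def central_temperature(mass_delta, beta, core_ratio):
    """Temperature implied by a characteristic mass, beta and core_ratio = r_c / r_Delta."""
    return (1.0 + core_ratio**2) / (3.0 * beta) * mass_delta ** (2.0 / 3.0)

# Hypothetical parameter values in dimensionless units.
beta, t0, r_c = 0.7, 5.0, 0.5
r_delta = characteristic_radius(beta, t0, r_c)
m_delta = beta_model_mass(r_delta, beta, t0, r_c)
t_check = central_temperature(m_delta, beta, r_c / r_delta)
print(r_delta, m_delta, r_delta**3)   # the last two values should agree
print(t_check)                        # should recover t0
```

The round trip from temperature to radius to mass and back to temperature makes explicit that, at fixed $\beta$ and fixed core-radius ratio, mass scales as temperature to the power 3/2.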



In the self-similar case, all clusters have the same values
of $\beta$ and $r_c/r_\Delta$, and so the self-similar scaling $M_\Delta \propto T_0^{3/2}$
follows directly from Equation 11. Furthermore, the hypergeometric
function in $M_{\rm gas}(r_\Delta)$ (Equation 8) assumes a constant
value, leading to a constant gas mass fraction, and the
other scaling laws in Equation 5 follow straightforwardly:
Conversely, departures from self-similarity result in
changes to these scaling laws. For example, consider the case
in which $r_c/r_\Delta$ remains constant, but $\beta$ varies from cluster
to cluster. The expectation value of the characteristic mass
at fixed temperature can be written as



$$ \langle M_\Delta | T_0 \rangle = \left[\frac{3\,T_0}{1 + (r_c/r_\Delta)^2}\right]^{3/2} \int d\beta\, P(\beta|T_0)\, \beta^{3/2}, \qquad (13) $$

where $P(\beta|T_0)$ is the distribution of $\beta$ values for clusters with
temperature $T_0$. Thus, if $\beta$ varies systematically with temperature
as $\langle \beta^{3/2} | T_0 \rangle \propto T_0^\alpha$, then the M-T slope implied
by this model is modified to $3/2 + \alpha$ (Figure 1). The generalization
when both $\beta$ and $r_c/r_\Delta$ vary is straightforward, and
the effect on the other scaling laws can be derived similarly.
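The shift of the slope can be illustrated with a small simulation. The sketch below assumes a fixed core-radius ratio, a power-law trend of $\beta$ with temperature, and a 10 per cent lognormal scatter in $\beta$ that does not depend on temperature. All numerical values are hypothetical, and the ordinary least-squares fit stands in for whatever regression method an analysis would actually use.

```python
import math
import random

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulated_mt_slope(alpha, n_clusters=2000, core_ratio=0.1, seed=2):
    """Logarithmic slope of mass against temperature for isothermal beta models."""
    rng = random.Random(seed)
    log_t, log_m = [], []
    for _ in range(n_clusters):
        t0 = math.exp(rng.uniform(math.log(2.0), math.log(12.0)))
        # beta^(3/2) scales as T0^alpha on average; the scatter is T-independent.
        beta = 0.65 * (t0 / 5.0) ** (2.0 * alpha / 3.0) * math.exp(rng.gauss(0.0, 0.1))
        # Mass-temperature relation of the isothermal beta model, solved for mass.
        mass = (3.0 * beta * t0 / (1.0 + core_ratio**2)) ** 1.5
        log_t.append(math.log(t0))
        log_m.append(math.log(mass))
    return ols_slope(log_t, log_m)

slope_flat = simulated_mt_slope(alpha=0.0)
slope_trend = simulated_mt_slope(alpha=0.3)
print(f"no trend in beta:  slope = {slope_flat:.2f}")
print(f"alpha = 0.3 trend: slope = {slope_trend:.2f}")
```

With no trend the fitted slope stays near 3/2 despite the scatter in $\beta$, while a trend with index $\alpha$ moves it towards $3/2 + \alpha$.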



3 APPROACHES TO MEASURING 
HYDROSTATIC MASSES 

3.1 Fully parametric 

When deriving hydrostatic cluster masses, a common prac- 
tice is to fit parametric functions for the three-dimensional 
gas density and temperature profiles to the observed sur- 
face brightness and temperature data, and then to derive 
the total cluster mass profile using Equation 1. For com- 
parison with the other methods described below, it should 
be noted that selecting parametrized models for n(r) and 
T(r) is completely equivalent (via Equation 1) to choosing 
parametrizations for M(r) and T(r), and thus implicitly im- 
poses a prior on the form of M(r). Because these functions 
share parameters, varying model parameters produces co- 
variance in M and T. That is, the choice of parametrized 
models also constitutes an implicit prior on the scaling re- 
lations, as described below. 

Generalizing Equation 13, we can write the mean mass- 
temperature relation resulting from fits to parametrized n(r) 
and T(r) models as (Equation 3) 



$$ \langle M_\Delta | T_\Delta \rangle = \int d\theta\, P(\theta|T_\Delta) \left[T(r_\Delta;\theta)\, \mathcal{F}(r_\Delta;\theta)\right]^{3/2} = T_\Delta^{3/2} \int d\theta\, P(\theta|T_\Delta) \left[\frac{T(r_\Delta;\theta)}{T_\Delta}\, \mathcal{F}(r_\Delta;\theta)\right]^{3/2}, \qquad (14) $$

where $\theta$ represents the full set of parameters describing n(r)
and T(r), and $T_\Delta$ is the measured gas temperature used
to form the scaling relation. Typically, $T_\Delta$ is an emission-
weighted average, dominated by the signal from relatively
small radii, in which case $T(r_\Delta)/T_\Delta$ is a measure of the
overall shape of the temperature profile. For self-similar clusters,
both $T(r_\Delta)/T_\Delta$ and $\mathcal{F}(r_\Delta)$ are constant, 4 and the
self-similar slope of 3/2 is trivially recovered. However, if either
of these quantities varies systematically with mass, the slope 
may be perturbed from the self-similar value. 

The explicit appearance of the exponent 3/2 in Equa- 
tion 14 makes clear that, in practice, our ability to detect 
departures from self-similarity using this approach depends 
on measuring the shape of the temperature and density profiles
at $r_\Delta$. In the case of temperature, this is a challenging
task for current X-ray observatories at even intermediate
cluster radii (e.g. $\Delta = 500$, a common choice). Furthermore,
and not incidentally, the priors on the forms of the temper- 
ature and density (or mass, equivalently) profiles must be 
flexible enough to admit departures from self-similarity. In 
practice, the parametrizations employed are generally mo- 
tivated by observable features of the surface brightness and 
temperature profiles, raising the possibility that the relatively
low signal-to-noise at intermediate radii, and subsequent
assumption of regular behavior in the profiles [i.e. similar
values of $T(r_\Delta)/T_\Delta$ and $\mathcal{F}(r_\Delta)$], produces a bias favoring
self-similarity.

Conversely, if the parametrizations provide too much 
flexibility near $r_\Delta$ to be effectively constrained by the data,
then departures from self-similarity cannot be constrained 



4 In principle, instrument-specific effects might make the ratio of 
$T(r_\Delta)$ to measured $T_\Delta$ vary, even for self-similar clusters. We do
not consider such effects here. 
Figure 1. Left: Illustration of some of the simple scaling behaviors available to fully parametrized cluster models such as the isothermal $\beta$
model. The solid, blue line represents an exactly self-similar scaling relation, which is expected when the model is too restrictive (does not
permit departures from self-similarity). If parameters that break self-similarity vary completely at random (e.g. if they are unconstrained
by data), then the mean relation is still self-similar, with a scatter determined by the range over which the model parameters can vary
(dotted, blue lines; see also right panel). However, if there is a systematic trend of the model parameters with mass or temperature, the
slope of the mean scaling relation can be perturbed (dashed, red line), and in general the relation need not be a power-law. Right: Numerical
demonstration (see Appendix A) of a self-similar scaling relation arising from random realizations of parametrized density and (non-isothermal)
temperature profiles (corresponding to the case illustrated by dotted lines in the left panel). Here the temperature is the emission-weighted
projected temperature within $r_\Delta$. Although the model parameters are chosen randomly, their randomization is not mass
dependent, so the structure of the hydrostatic equation results in a self-similar mean scaling relation (red line). Appendix A contains details
of the parametrized models used and the allowed ranges for the model parameters.



either. In the extreme case where the data provide no constraint
at all on the profiles near $r_\Delta$, the bracketed expression
in Equation 14 simply samples the prior (the allowed region
in model parameter space). If that prior is independent
of $T_\Delta$, as common practice would dictate, the resulting scaling
relation must have the self-similar slope on average, with
the prior simply determining the size of the scatter about the
relation. This behavior is explicitly demonstrated in the right
panel of Figure 1, which shows a mass-temperature relation
resulting from random realizations of a non-isothermal,
parametrized cluster model (see Appendix A). The random- 
ized density and temperature profile models vary widely in 
shape and normalization, but because these variations have 
no mass dependence, the structure of Equation 14 results in 
a self-similar mean scaling relation. 
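A stripped-down version of this argument can be sketched in a few lines. The code below does not reproduce the models of Appendix A. It simply draws the product of the temperature-shape ratio and the logarithmic-slope term from a uniform prior that is independent of temperature (an assumption made for illustration), forms masses in the dimensionless units used above, and fits a power law. The prior ranges are hypothetical.

```python
import math
import random

def fit_power_law(temps, masses):
    """Least-squares slope and rms scatter of ln M against ln T."""
    xs = [math.log(t) for t in temps]
    ys = [math.log(m) for m in masses]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    scatter = math.sqrt(sum(e * e for e in resid) / (n - 2))
    return slope, scatter

def sample_relation(shape_range, n_clusters=4000, seed=3):
    """Draw the profile-shape factor from a prior that ignores temperature."""
    rng = random.Random(seed)
    low, high = shape_range
    temps, masses = [], []
    for _ in range(n_clusters):
        t_delta = math.exp(rng.uniform(math.log(2.0), math.log(12.0)))
        shape = rng.uniform(low, high)   # stands in for [T(r_D)/T_D] * F(r_D)
        temps.append(t_delta)
        masses.append((t_delta * shape) ** 1.5)
    return fit_power_law(temps, masses)

narrow = sample_relation((1.0, 1.2))
wide = sample_relation((0.5, 2.5))
print("narrow prior: slope %.2f, scatter %.2f" % narrow)
print("wide prior:   slope %.2f, scatter %.2f" % wide)
```

Widening the prior inflates the scatter but leaves the mean slope close to 3/2, which is the behavior described above.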

Thus, the inability of current observatories to constrain 
high-resolution temperature profiles at the radii of interest 
poses a dilemma for the fully parametric approach to mass 
estimation. Allowing too little freedom in the adopted forms 
of the n(r) and T(r) profiles risks assuming implicitly that 
clusters are self-similar. On the other hand, allowing too 
much freedom can result in the profiles at $r_\Delta$ being so poorly
constrained that departures from self-similarity in individual 
clusters cannot be constrained either; based on the argument 
above, this case may well also result in an apparently self- 
similar scaling relation on average. 

Apart from temperature, the fully parametric mass estimate
depends on the shape of n(r) at $r_\Delta$. Both $M_{\rm gas}$ and Y
have a dependence on this quantity, being integrals of n(r) 
weighted towards large radii. However, the surface bright- 
ness profile can be determined at much higher resolution 
than temperature from X-ray data, meaning that priors on 



the shape of n(r) need not be as influential. Provided that 
the choice of density parametrization is not overly restric- 
tive, we would thus expect biases towards self similarity to 
be less of a concern for the $M_{\rm gas}$-M relation compared to
T-M or Y-M. The X-ray luminosity-mass relation should 
be essentially free of this bias, since it is dominated by emis- 
sion from the dense gas at cluster centers, at radii typically 
$\ll r_\Delta$. It is therefore interesting to note that the L-M relation
is the only one of these scalings for which the fully
parametric approach to mass estimation has consistently 
measured strong departures from self-similarity in the slope 
(e.g. Vikhlinin et al. 2009a; see also Section 4). 



3.2 Semi-parametric 

An alternative method for determining cluster hydrostatic 
mass profiles was developed by Fabian et al. (1981, see also 
White, Jones, & Forman 1997; Allen, Schmidt, & Fabian 
2001; Schmidt, Allen, & Fabian 2001). In this approach, 
a functional form for the total mass profile is explicitly 
adopted, and used in conjunction with a non-parametric de- 
scription of the surface brightness to predict the tempera- 
ture in concentric shells. Temperature measurements from 
spectral data then provide the means to constrain the pa- 
rameters of the mass model. 
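The core step of such a scheme, predicting temperatures from a trial mass model and a tabulated gas density, can be sketched as follows. This is a simplified illustration rather than the published procedure: it ignores projection, works at discrete radii instead of finite shells, uses dimensionless units in which hydrostatic equilibrium reads d(nT)/dr = -n M(r)/r^2, and requires an assumed outer temperature as a boundary condition. The NFW parameter values and density table are hypothetical.

```python
import math

def nfw_mass(r, m_scale, r_s):
    """Enclosed mass of an NFW profile; m_scale and r_s are the free parameters."""
    x = r / r_s
    return m_scale * (math.log(1.0 + x) - x / (1.0 + x))

def predict_temperatures(radii, densities, mass_model, t_outer):
    """Integrate hydrostatic equilibrium inward from the outermost radius.

    radii must be increasing; densities are the tabulated gas densities at
    those radii; t_outer is the assumed temperature at radii[-1].
    """
    integrand = [n * mass_model(r) / r**2 for r, n in zip(radii, densities)]
    pressure = [0.0] * len(radii)
    pressure[-1] = densities[-1] * t_outer
    for i in range(len(radii) - 2, -1, -1):
        dr = radii[i + 1] - radii[i]
        # Trapezoidal rule for the pressure gained across this interval.
        pressure[i] = pressure[i + 1] + 0.5 * (integrand[i] + integrand[i + 1]) * dr
    return [p / n for p, n in zip(pressure, densities)]

# Toy inputs (hypothetical): densities as they might emerge from a deprojection.
radii = [0.2, 0.4, 0.6, 0.8, 1.0]
densities = [1.00, 0.45, 0.24, 0.15, 0.10]
model = lambda r: nfw_mass(r, m_scale=20.0, r_s=0.3)
predicted = predict_temperatures(radii, densities, model, t_outer=4.0)
print(predicted)
```

In a fit, the predicted temperatures would be compared with the spectrally measured ones, and the mass-model parameters (and outer boundary condition) adjusted to minimize the mismatch. No functional form is imposed on the density or temperature profiles themselves.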

In contrast to fully parametric methods, the semi- 
parametric approach does not restrict the forms of the ICM 
density or temperature profiles. Apart from the regulariza- 
tion imposed by the size of the annular regions analyzed (a 
factor in all of the approaches discussed here), these profiles
are not constrained a priori. Whereas the fully paramet- 
ric approach implies an implicit prior on the form of M(r), 
the semi-parametric method explicitly adopts a prior on the
form of this function, typically motivated by numerical sim- 
ulations of cluster formation (e.g. the model of Navarro, 
Frenk, & White 1997, hereafter NFW). 

As a consequence, the semi-parametric approach does 
not impose a prior on the form of the mass-temperature re- 
lation (or the other scaling relations) in the sense of Equa- 
tion 14. That is not to say that the procedure imposes no 
priors at all; the choice of a particular form for the mass 
profile explicitly does so. In this case, however, the effect of 
the prior on the mass reconstruction is completely transpar- 
ent, and the goodness of fit furthermore provides a means 
to evaluate the mass model, a significant advantage. 

3.3 Non-parametric 

The most general possibility for hydrostatic mass analy- 
sis is a fully non-parametric de-projection. Examples in- 
clude the methods introduced by Arabadjis et al. (2004) 
and Ameglio et al. (2009), in which numerical derivatives 
of non-parametric ICM density and temperature profiles 
are directly used to reconstruct the enclosed mass, subject 
to the constraint that mass increase with radius; and by 
Nulsen et al. (2010), in which the total densities in concen- 
tric spherical shells are free model parameters. In a sense, 
these approaches are, respectively, logical extensions of the 
fully parametric methods, which use the derivatives of n(r) 
and T(r) to derive the mass, and the semi-parametric meth- 
ods, which model the mass profile directly. However, these 
non-parametric methods require very high quality data com- 
pared to methods which impose some kind of prior on the 
mass distribution; as has already been mentioned, X-ray 
data typically cannot resolve the temperature gradient near 
$r_{500}$. The use of these approaches has thus been relatively
limited. 



3.4 Non-hydrostatic proxies 

Finally, the explicit assumption of HSE can be bypassed by 
estimating mass using a proxy (e.g. $M_{\rm gas}$ or $Y_X = M_{\rm gas} T_\Delta$)
from an external scaling relation. This approach clearly car- 
ries its own prior, namely the validity of the mass proxy, 
which must be verified and calibrated using true mass de- 
terminations. There are also restrictions on what scaling re- 
lations can sensibly be investigated using this technique; for 
example, given its definition, $Y_X$-derived masses should not
be used to investigate scalings with gas mass or temperature.
On the other hand, for hot clusters ($kT \gtrsim 4$ keV), $M_{\rm gas}$ is a
good mass proxy whose determination is essentially independent
of temperature (Allen et al. 2008), so masses estimated
from $M_{\rm gas}$ can reasonably be used to study the M-T relation
in this mass range (e.g. Mantz et al. 2010a). The appropri- 
ate use of mass proxies can thus potentially increase the 
available sample size and redshift range for studying some 
scaling relations. 
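As a hedged sketch of how a gas-mass proxy can yield a total mass, the code below assumes a constant gas mass fraction and solves for the radius at which the implied mean enclosed density equals the chosen multiple of the critical density. The gas mass fraction of 0.11, the Hubble parameter h = 0.7 and the power-law gas mass profile are all illustrative assumptions, not values taken from the works cited here.

```python
import math

RHO_CRIT_0 = 2.775e11   # critical density today, in h^2 solar masses per Mpc^3

def radius_from_gas_mass(gas_mass_profile, f_gas, delta, h=0.7, e_z=1.0,
                         r_lo=0.05, r_hi=5.0, iterations=100):
    """Solve M_gas(<r) / f_gas = (4 pi / 3) delta rho_c(z) r^3 for r in Mpc.

    gas_mass_profile(r) returns the enclosed gas mass in solar masses;
    e_z is the normalized Hubble parameter E(z) at the cluster redshift.
    Returns the characteristic radius and the corresponding total mass.
    """
    rho_c = RHO_CRIT_0 * h**2 * e_z**2

    def excess(r):
        return gas_mass_profile(r) / f_gas - 4.0 * math.pi / 3.0 * delta * rho_c * r**3

    for _ in range(iterations):
        r_mid = 0.5 * (r_lo + r_hi)
        if excess(r_mid) > 0.0:
            r_lo = r_mid    # mean enclosed density still above delta * rho_c
        else:
            r_hi = r_mid
    r_delta = 0.5 * (r_lo + r_hi)
    return r_delta, gas_mass_profile(r_delta) / f_gas

def toy_gas_mass(r_mpc):
    """Hypothetical cumulative gas mass profile in solar masses."""
    return 2.0e13 * r_mpc ** 1.2

r500, m500 = radius_from_gas_mass(toy_gas_mass, f_gas=0.11, delta=500.0)
print(f"r500 = {r500:.3f} Mpc, M500 = {m500:.2e} solar masses")
```

Temperature never enters this estimate, which is why masses obtained this way can sensibly be compared against temperature, whereas a proxy built from temperature could not.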



4 META-ANALYSIS OF CONSTRAINTS ON 
THE MASS-TEMPERATURE SLOPE 

The comparison of scaling relations derived in different 
works is complicated by a variety of potential systematics, 
Figure 2. Mass-temperature slopes and 68.3 per cent uncertainties
from the literature (see text for citations). In some cases,
we have included multiple results from the same paper, based on 
different data sets or mass models. Red circles indicate results 
obtained from fitting parametric n(r) and T(r) models; blue tri- 
angles show results using a non-parametric description of the ICM 
along with an explicit prior on the form of the total mass profile; 
and the green square uses gas mass as a proxy for total mass. The 
dashed line shows the self-similar value of the slope. 



including (potentially redshift-dependent) selection effects, 
instrument cross-calibration, and the use of different regres- 
sion methods over the years, in addition to the issues dis- 
cussed in this paper (see also Appendix B). Nevertheless, it 
is interesting to test whether there is any trend in scaling 
relation results from the literature with the mass modeling 
technique employed. 

Here we focus on mass-temperature relations measured 
from X-ray data, which have a particularly long history. A 
sampling of M-T slopes and reported uncertainties from the 
literature over the past 12 years is shown in Figure 2 (Horner 
et al. 1999; Finoguenov et al. 2001; Arnaud et al. 2005; 
Popesso et al. 2005; Vikhlinin et al. 2006, 2009a; Morandi 
et al. 2007; Allen et al. 2008; Sun et al. 2009; Juett et al. 
2010; Mantz et al. 2010a). In some cases, we have included 
multiple results from the same authors, where different data 
sets or mass models produced noticeably different results. 
In the figure, red circles are results obtained by fitting fully 
parametric n(r) and T(r) models. Blue triangles indicate re- 
sults using a non-parametric description of the ICM along 
with an explicit prior on the form of the total mass pro- 
file (semi-parametric methods). The green square reflects a 
study of massive clusters where gas mass was used as a proxy 
for total mass. Temperature measurements used in the dis- 
played results are all emission-weighted averages, and masses 
in a given study are estimated at a constant value of $\Delta$ (see
below). 

Among the results based on fits to very simple n(r) and 
T(r) models are: 

(i) Two results from Horner et al. (1999) where masses 
were estimated from X-ray observations. The first slope, 
1.48 ± 0.12, is from a heterogeneous sample with measured 
temperature profiles. For the second, the isothermal $\beta$ model
was fitted to the data of Fukazawa (1997), resulting in a
slope of 1.78 ± 0.05. In both cases (as well as their third
result, below), masses were rescaled to $\Delta = 200$.

(ii) The first two results from Finoguenov et al. (2001) 
use a compilation of clusters for which resolved temperature 
profiles were available, with masses estimated at $\Delta = 500$ by
fitting a $\beta$ model density profile and assuming a polytropic
relationship between gas density and temperature. The first
slope, 1.78 ± 0.10, uses the entire sample, while the second,
1.58 ± 0.05, was obtained by excluding 4 clusters with measured
$\beta < 0.4$.

(iii) The third result shown from Finoguenov et al. (2001) 
and the slope from Popesso et al. (2005) were derived by fitting
different subsets of the data of Reiprich & Böhringer
(2002), respectively finding slopes of 1.64 ± 0.05 and 1.59 ±
0.04. The Reiprich & Böhringer (2002) analysis provides
masses at $\Delta = 500$ from isothermal $\beta$ model fits.

More recent works using n(r) and T(r) fits generally 
have used more complicated models, which are detailed in 
the respective papers: 

(i) Arnaud et al. (2005) used masses measured by Pointe- 
couteau et al. (2005), who fitted functions for n(r) and T(r) 
to X-ray data. The figure shows their slope of 1.49 ± 0.14 
for clusters with kT > 3.5 keV at $\Delta = 1000$, the largest radius
for which no extrapolation was required. Their results
at other radii are very similar. 

(ii) Vikhlinin et al. (2006) fitted parametrized models to 
13 clusters, obtaining mass-temperature relations at $\Delta = 2500$
and 500 with both emission- and mass-weighted temperatures.
In the figure, we show the emission-weighted slope
for $\Delta = 500$, 1.61 ± 0.11. The best fitting values in the other
cases ranged from 1.51 to 1.64. The analysis was extended
to 17 clusters in Vikhlinin et al. (2009a), resulting in a slope
of 1.53 ± 0.08.

(iii) Sun et al. (2009) fitted a sample of 23 groups and 14 
clusters, spanning 0.7 keV < kT < 11 keV, obtaining a slope 
of 1.65 ± 0.04 at $\Delta = 500$.

(iv) Juett et al. (2010) fitted the models of Vikhlinin et al. 
(2006) to 28 clusters with kT > 2 keV, finding a slope of
1.67 ± 0.16 at $\Delta = 500$.

The mass-temperature slopes that rely on fully parametric
fits to n(r) and T(r) tend to cluster in the 1.50-1.65
range. The exceptions are one result from Horner
et al. (1999, using the isothermal $\beta$ model) and one from
Finoguenov et al. (2001, using a polytropic model). In the
former case, Horner et al. (1999) comment that there exists
a clear correlation between measured values of $T_0$ and
$\beta$ in the data (Fukazawa 1997), $\beta \propto T_0^{0.26 \pm 0.03}$. Based on
Section 2.2, one might expect such a correlation to result in
a steeper slope when the isothermal $\beta$ model is used. The
simple $3/2 + \alpha$ formula from Section 2.2 over-predicts the
size of the effect: $\beta^{3/2} \propto T_0^{0.39}$ implies a yet steeper slope
than was observed. The full explanation likely involves the
effect of the third fit parameter, $r_c$, as well as the measurement
errors and the method used to fit the scaling relation.
In Finoguenov et al. (2001), eliminating the clusters with
the smallest $\beta$ measurements (which also happen to be at
the low-temperature end of the data set) reduces both the
empirical $\beta$-$T_0$ correlation and the mass-temperature slope
(compare their first and second results in the figure), supporting
the qualitative notion that model parameter correlations
contribute to steepening of the slope. Similarly, the
Reiprich & Böhringer (2002) data set used by Finoguenov
et al. (2001, their third result above) and Popesso et al.
(2005) has an empirically smaller correlation between $T_0$
and $\beta$, and fits to a correspondingly shallower slope. The
works using more complicated n(r) and T(r) models generally
show weaker departures from the self-similar value.

Relatively fewer authors have used an explicit prior on 
the form of the mass profile, along with a non-parametric 
description of the ICM: 

(i) The third result from Horner et al. (1999) employs 
masses from White et al. (1997), obtaining a slope of 2.06 ± 
0.10. The mass profiles were constrained by a combination 
of galaxy velocity dispersion and X-ray temperature data. 

(ii) Morandi et al. (2007) fitted a mass profile motivated 
by Rasia et al. (2004) to X-ray data for 24 hot (kT > 5 keV)
clusters, obtaining a slope of 1.7 ± 0.4 at $\Delta = 2500$. However,
when they allow the normalization of the scaling relation to 
evolve with redshift, the measured mass-temperature slope 
is steeper, 2.30 ± 0.24. 

(iii) Allen et al. (2008) fitted an NFW mass profile to 
non-parametric surface brightness and temperature data for 
42 massive, dynamically relaxed clusters, obtaining HSE 
masses at $\Delta = 2500$. The NFW profile provides an acceptable
fit to the data (see also Schmidt & Allen 2007). Combining
these mass measurements with temperatures from an
extension of the work in Mantz et al. (2010a), we obtain a 
mass-temperature slope of 1.91 ± 0.19 (see Appendix C). 

The final result shown in the figure is from Mantz et al. 
(2010a), who used gas mass as a proxy to estimate total 
masses at $\Delta = 500$ for a sample of 94 hot, massive clusters,
obtaining a mass-temperature slope of 2.04 ± 0.15. Because
the gas mass fraction was calibrated using the data of Allen 
et al. (2008), the two results are not entirely independent. 
On the other hand, relatively few of the Mantz et al. (2010a) 
clusters are in the Allen et al. (2008) data set, so this de- 
pendence should largely be limited to the normalization of 
the scaling relation. 

Apart from the Morandi et al. (2007) slope, which has 
a large uncertainty, the results that use explicit priors on 
the form of the mass profile or employ gas mass as a proxy 
appear to prefer a relatively steep slope compared with the 
other works, ~ 2.0. Given that mass models such as the 
NFW profile are well motivated by numerical simulations 
and provide an acceptable fit to cluster data (e.g. Schmidt 
& Allen 2007), the segregation apparent in Figure 2 sug- 
gests that the implicit priors in fully parametric n(r) and 
T(r) models bias the resulting M-T slopes towards the self- 
similar value. 
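The segregation can be summarized crudely with inverse-variance weighted means of the two groups of slopes quoted above. This is purely a descriptive sketch and not a calculation from the original analysis: it treats the quoted uncertainties as Gaussian and the results as independent, which they are not (several share data sets, and two of the Morandi et al. values come from the same sample), so the formal errors it returns should not be taken at face value.

```python
import math

# (slope, 1-sigma uncertainty) as quoted in the text above.
FULLY_PARAMETRIC = [
    (1.48, 0.12), (1.78, 0.05),                 # Horner et al. (1999)
    (1.78, 0.10), (1.58, 0.05), (1.64, 0.05),   # Finoguenov et al. (2001)
    (1.59, 0.04),                               # Popesso et al. (2005)
    (1.49, 0.14),                               # Arnaud et al. (2005)
    (1.61, 0.11), (1.53, 0.08),                 # Vikhlinin et al. (2006, 2009a)
    (1.65, 0.04),                               # Sun et al. (2009)
    (1.67, 0.16),                               # Juett et al. (2010)
]
EXPLICIT_MASS_PRIOR_OR_PROXY = [
    (2.06, 0.10),                 # Horner et al. (1999), White et al. (1997) masses
    (1.70, 0.40), (2.30, 0.24),   # Morandi et al. (2007)
    (1.91, 0.19),                 # Allen et al. (2008) masses
    (2.04, 0.15),                 # Mantz et al. (2010a), gas-mass proxy
]

def weighted_mean(results):
    """Inverse-variance weighted mean and its formal uncertainty."""
    weights = [1.0 / sigma**2 for _, sigma in results]
    mean = sum(w * slope for w, (slope, _) in zip(weights, results)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

mean_parametric, err_parametric = weighted_mean(FULLY_PARAMETRIC)
mean_explicit, err_explicit = weighted_mean(EXPLICIT_MASS_PRIOR_OR_PROXY)
print(f"fully parametric:        {mean_parametric:.2f} +/- {err_parametric:.2f}")
print(f"explicit prior or proxy: {mean_explicit:.2f} +/- {err_explicit:.2f}")
```

The grouping follows the colour coding of Figure 2. A more careful treatment would model the shared data and the differing values of $\Delta$, temperature ranges and regression methods.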



5 SUMMARY 

In this paper, we have discussed the influence of priors on 
hydrostatic mass estimates of clusters and on the resulting 
mass-observable scaling relations. The use of fully paramet- 
ric gas density (or X-ray brightness) and temperature pro- 
files, similar to those commonly used in the literature, intro- 
duces an implicit prior on the form of the mass profile via 
the hydrostatic equation. Furthermore, the structure of the 
prior thus imposed results in an implicit prior on the cluster 
scaling relations. If the parametrized models employed are 
insufficiently flexible, or conversely if they are too general 
to be constrained at the radii of interest, then constraints 
on the scaling relations will be biased towards having self- 
similar slopes. 

Alternative techniques for hydrostatic mass measure- 
ment exist which, by construction, do not suffer from this 
bias. The most common of these is a semi-parametric approach, 
in which a parametric prior on the form of the 
mass profile is explicitly adopted, with the ICM described 
independently and non-parametrically. Typically, the priors 
used here are motivated by the results of numerical sim- 
ulations. An advantage of the semi-parametric approach is 
that it requires no a priori assumptions about the potentially 
complex form of the ICM density and temperature profiles, 
and that the applicability of the mass profile model can be 
straightforwardly evaluated through the goodness of fit. We 
comment further on the relative merits of various methods, 
and offer general recommendations, in Appendix B. 

In the literature, results for the mass-temperature 
slope obtained by fitting parametric n(r) and T(r) profiles 
tend to cluster relatively near the self-similar value. Semi- 
parametric analyses appear to prefer a significantly steeper 
mass-temperature relation, although there are relatively few 
such works to consider. While a variety of systematic effects 
can potentially affect the scaling relations, this segregation 
of values for the M-T slope suggests that the priors im- 
posed during mass estimation have a significant influence 
that needs to be considered carefully. 

As cluster surveys at all wavelengths are expanded to 
higher and higher redshifts, and are used to investigate more 
complex cosmological questions, accurate calibration of the 
relevant scaling relations will only become more important. 
Gravitational lensing will make a unique contribution to 
these efforts, particularly in assessing the residual bias in 
ICM-based mass estimates due to the HSE assumption. Nev- 
ertheless, ICM-based mass measurements for relaxed sys- 
tems will remain an important ingredient in cluster cosmol- 
ogy due to the higher precision and lower systematic scatter 
of individual estimates compared to lensing. It is therefore 
critical, going forward, that the priors employed in these 
measurements be minimal, straightforwardly testable, and 
well understood. 


APPENDIX A: SIMULATIONS 

As discussed in Section 3.1, when the model parameters that 
determine the density and temperature profiles at r_Δ are 
unconstrained or poorly constrained, the fully parametric 
approach can be biased towards self-similar scaling relations. 
As an explicit illustration of this, consider a β model 
description of the gas density in conjunction with a simple, 
non-isothermal temperature profile, 

$$ T(r) = \frac{T_0}{\left[1 + (r/r_t)^b\right]^{c/b}}. \qquad \mathrm{(A1)} $$

This function is a simplification of the form used by 
Vikhlinin et al. (2006), namely eliminating the 'cool core' 
term, which is intended to describe the profile at small radii. 

To illustrate the case where these models are effectively 
unconstrained, we generated random realizations by sampling 
independent, uniform values of the model parameters 
within the ranges given in Table A1. The radial scales in the 
density and temperature models, r_c and r_t, were allowed to 
take values between zero and r_max = max √(3βT_0). This is the 
maximum value of r_c for which the isothermal β model has a 
real solution for r_Δ (Equation 10); while the same is not true 
of this non-isothermal model, allowing larger values does not 
change the resulting picture qualitatively. β was allowed to 
vary over a range somewhat wider than that seen in 
observations, while the temperature exponents, b and c, were 
varied over approximately the range allowed by Vikhlinin 
et al. (2006). To provide an adequate baseline to observe 
the resulting scaling behavior, the temperature normalization, 
T_0, was sampled uniformly in the logarithm between 1 
and 1000. 

Figure A1. A sample of 10 randomized gas density and temperature 
profile models, demonstrating the range of behavior spanned 
by the randomization (Table A1). Both panels are plotted against r/r_Δ. 

For each realization, an implicit solution for r_Δ (Equation 3) 
was searched for numerically, and models for which 
there was no real solution were discarded. A sample of the 
resulting density and temperature profiles is shown in Figure A1. 
The model profiles are clearly not self-similar in any 
meaningful sense, but, because their variation is independent 
of mass, the slopes of the mean scaling relations take 
on the self-similar values (right panel of Figure 1). For clarity, 
we have culled models where r_c/r_Δ > 0.7 from the figure; 
these models produce an asymmetric tail to low masses, 
but do not change the scaling relation slope. The x-axis of 
the figure shows the emission-weighted projected temperature 
within r_Δ, calculated from the n(r) and T(r) profiles, 
although the precise definition of T_Δ does not affect the 
conclusions. 
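
The randomization experiment described in this appendix can be sketched in a few lines. The code below is a minimal illustration rather than the procedure actually used: it assumes scaled units in which the hydrostatic equation reads M(<r) = -r T(r) [dln n/dln r + dln T/dln r] and the overdensity condition is M(<r_Δ) = r_Δ^3; it draws the scale radii as uniform fractions of √(3βT_0) for each realization; it uses hypothetical parameter ranges (Table A1 is not reproduced here); it solves for r_Δ by bisection; and it uses T_0 in place of the emission-weighted temperature. All function names are illustrative.

```python
import math
import random


def hse_mass(r, T0, beta, rc, rt, b, c):
    """Hydrostatic mass within r in scaled units:
    M(<r) = -r T(r) [dln n/dln r + dln T/dln r]."""
    y = (r / rt) ** b
    temperature = T0 / (1.0 + y) ** (c / b)          # Equation A1
    minus_dlnn = 3.0 * beta * r**2 / (r**2 + rc**2)  # beta-model density slope
    minus_dlnT = c * y / (1.0 + y)                   # slope of Equation A1
    return r * temperature * (minus_dlnn + minus_dlnT)


def solve_r_delta(params, r_lo, r_hi, n_iter=100):
    """Bisect in log r for M(<r) = r^3 (scaled overdensity condition).
    Returns None when the bracket shows no sign change."""
    def excess(r):
        return hse_mass(r, *params) - r**3

    if excess(r_lo) <= 0.0 or excess(r_hi) >= 0.0:
        return None
    for _ in range(n_iter):
        r_mid = math.sqrt(r_lo * r_hi)
        if excess(r_mid) > 0.0:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return math.sqrt(r_lo * r_hi)


def simulate(n_models=2000, seed=1):
    rng = random.Random(seed)
    log_T, log_M = [], []
    for _ in range(n_models):
        T0 = 10.0 ** rng.uniform(0.0, 3.0)   # log-uniform between 1 and 1000
        beta = rng.uniform(0.4, 1.2)         # hypothetical range
        b = rng.uniform(1.0, 5.0)            # hypothetical range
        c = rng.uniform(0.0, 1.0)            # hypothetical range
        r_max = math.sqrt(3.0 * beta * T0)
        rc = rng.uniform(0.01, 1.0) * r_max  # assumed: fractions of r_max
        rt = rng.uniform(0.01, 1.0) * r_max
        params = (T0, beta, rc, rt, b, c)
        r_delta = solve_r_delta(params, 1e-4 * r_max, 1e2 * r_max)
        if r_delta is None:
            continue                         # discard models with no solution
        log_T.append(math.log10(T0))
        log_M.append(3.0 * math.log10(r_delta))  # M = r_delta^3 in these units
    return log_T, log_M


def least_squares_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


log_T, log_M = simulate()
slope = least_squares_slope(log_T, log_M)
print(f"fitted mass-temperature slope: {slope:.2f}")
```

Because every length in this sketch scales as √T_0, the profile shapes vary independently of mass and the fitted slope should scatter around the self-similar value of 3/2, even though the individual profiles differ widely.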



APPENDIX B: RECOMMENDATIONS 

Here we offer some brief thoughts on the task of obtaining 
hydrostatic mass estimates using minimal assumptions apart 
from hydrostatic equilibrium and spherical symmetry. Foremost, 
it must be emphasized that invoking HSE for systems 
that are not dynamically relaxed^5 results in significant bias 
and systematic scatter at the level of tens of per cent (e.g. 
Nagai, Vikhlinin, & Kravtsov 2007). For relaxed clusters 
and measurements made at intermediate radii (r ~ r_2500), 
bias and scatter due to residual departures from equilibrium 
should be at the ~ 10 per cent level or less. However, 
even in relaxed clusters, the assumption of HSE should be 
avoided in the outer regions (r > r_500) where gas clumping 
(Simionescu et al. 2011) and increased non-thermal pressure 
support (Nagai et al. 2007; Pfrommer et al. 2007; Mahdavi 
et al. 2008) may occur. The same is true of the central few 
tens of kpc in systems where the influence of the central active 
galactic nucleus is often evident (e.g. Fabian et al. 2003; 
Forman et al. 2005; McNamara & Nulsen 2007). 

Ideally, a non-parametric method would be used to es- 
timate masses, for example the model recently described by 
Nulsen et al. (2010). In this particular approach, the cluster 
is modeled as a series of concentric, spherical shells, with 
constant temperature and total (dark matter + baryons) 
density in each shell. These non-parametric temperature and 
mass profiles, with the addition of an overall gas density 
normalization and under the assumption of HSE, determine 
the gas density at all radii. Although this method is advan- 
tageous in principle, in practice it is only feasible with very 
high-quality data (Nulsen et al. 2010). 

More generally, some kind of regularization, in the form 
of an analytic model, is required to constrain hydrostatic 
masses. Based on the considerations discussed in this paper, 
it is preferable to apply this model to the total mass profile 
(e.g. the NFW model) rather than the ICM. Considered as 
a modification of the above algorithm, the resulting semi- 
parametric model is parametrized by a set of temperatures 
in concentric shells, a normalization for the gas density, and 
the parameters of the chosen total mass model. Precisely this 
approach was used recently by Simionescu et al. (2011), who 
adopted an NFW description of the mass profile to model 
Suzaku data for the Perseus cluster.^6 The similar method 
of Fabian et al. (1981), in which the ICM is parametrized 
by the surface brightness in concentric annuli and a pres- 
sure normalization at large radius, has also found use in the 
literature (White et al. 1997; Allen et al. 2001, 2004, 2008; 
Schmidt et al. 2001; Schmidt & Allen 2007; Siemiginowska 
et al. 2010). We note that the use of a parametrized mass 
profile along with the assumption of HSE typically allows the 
non-parametric temperature profile to be modeled at higher 
spatial resolution than in either a fully non-parametric mass 
solution or a simple, geometric de-projection (e.g. using the 
PROJCT model in XSPEC^7) where the mass is not modeled at 
all. 
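
For reference, the ingredients of such a semi-parametric model can be written compactly. The expressions below are the standard hydrostatic equation for a spherically symmetric ideal gas and the standard NFW enclosed mass. The symbols ρ_s and r_s (NFW characteristic density and scale radius), μ m_p (mean particle mass) and the shell index i are notation introduced here for illustration and may differ from that used in the works cited above.

```latex
% Hydrostatic equilibrium for a spherically symmetric ideal gas
M(<r) = -\frac{r\,kT(r)}{G\mu m_p}
        \left[\frac{d\ln n}{d\ln r} + \frac{d\ln T}{d\ln r}\right]

% NFW enclosed mass
M_{\rm NFW}(<r) = 4\pi\rho_s r_s^3
        \left[\ln\left(1+\frac{r}{r_s}\right) - \frac{r/r_s}{1+r/r_s}\right]

% Within shell i, where the temperature is held constant at T_i
\frac{d\ln n}{dr} = -\frac{G\mu m_p\,M_{\rm NFW}(<r)}{kT_i\,r^2}
```

Given the NFW parameters and the shell temperatures, the last relation fixes the shape of the gas density profile within each shell, which is the sense in which the mass model, rather than the ICM, carries the parametric prior.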

5 In X-rays, dynamically relaxed systems are generally identified 
as having sharp surface brightness peaks and minimal isophote 
rotation or centroid variation. 

6 The Nulsen et al. (2010) model, including both non-parametric 
and NFW variants, is expected to be included in the next public 
release of XSPEC (P. Nulsen, private communication). 

7 http://heasarc.gsfc.nasa.gov/docs/xanadu/xspec/ 

Figure C1. Scaling relation from the cluster mass measurements 
of Allen et al. (2008) and temperature measurements of Mantz 
et al. (2010a). The best fitting power law, shown by the red line, 
has a slope of 1.91 ± 0.19. The fit accounts for measurement errors 
in both quantities and is not sensitive to possible correlation of 
the measurement uncertainties for each cluster (see text). The 
horizontal axis is kT (keV). 

For fitting the scaling relations themselves, we note that 
a full treatment generically requires simultaneous modeling 
of the cluster mass function due to selection effects (Mantz 
et al. 2010b; Allen et al. 2011). Only when the intrinsic 
covariance between the observable of interest and the observable 
used to select the cluster sample is sufficiently small can 
approximate results be obtained without explicitly modeling 
the mass function and selection process. In this case, the 
analysis should still include a full treatment of heteroscedastic 
and possibly correlated measurement errors, and intrinsic 
scatter (e.g. Gelman et al. 2004; Kelly 2007). 
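
The ingredients named in the fitting recommendation above (heteroscedastic and possibly correlated measurement errors, plus intrinsic scatter) can be illustrated with a deliberately simplified likelihood. The sketch below is not the Kelly (2007) linmix_err algorithm and does not model the mass function or selection: it assumes a Gaussian likelihood for a straight line in log space with an effective variance per point, an approximation that can be biased when the errors in x are large compared with the spread in x. The data are synthetic, and all names and numerical values are hypothetical.

```python
import math
import random


def point_variance(b, s_int, vx, vy, cxy):
    """Effective variance of y - a - b*x for one cluster."""
    return s_int**2 + vy + b**2 * vx - 2.0 * b * cxy


def best_intercept(b, s_int, data):
    """For fixed slope and scatter the optimal intercept is a weighted mean."""
    num = den = 0.0
    for x, y, vx, vy, cxy in data:
        w = 1.0 / point_variance(b, s_int, vx, vy, cxy)
        num += w * (y - b * x)
        den += w
    return num / den


def neg_log_like(a, b, s_int, data):
    total = 0.0
    for x, y, vx, vy, cxy in data:
        var = point_variance(b, s_int, vx, vy, cxy)
        total += 0.5 * (math.log(2.0 * math.pi * var) + (y - a - b * x) ** 2 / var)
    return total


def fit_power_law(data, slopes, scatters):
    """Grid search over slope and intrinsic scatter, profiling the intercept."""
    best = None
    for b in slopes:
        for s in scatters:
            a = best_intercept(b, s, data)
            nll = neg_log_like(a, b, s, data)
            if best is None or nll < best[0]:
                best = (nll, a, b, s)
    return best[1:]


# Synthetic sample: x = log10(kT), y = log10(M), true slope 2.0 (all made up)
rng = random.Random(42)
true_a, true_b, true_s = 0.3, 2.0, 0.05
sx, sy = 0.02, 0.05
data = []
for _ in range(150):
    x_true = rng.uniform(0.6, 1.2)
    y_true = true_a + true_b * x_true + rng.gauss(0.0, true_s)
    data.append((x_true + rng.gauss(0.0, sx), y_true + rng.gauss(0.0, sy),
                 sx**2, sy**2, 0.0))  # last entry: error covariance (zero here)

slopes = [1.0 + 0.02 * i for i in range(101)]
scatters = [0.01 * i for i in range(21)]
a_fit, b_fit, s_fit = fit_power_law(data, slopes, scatters)
print(f"slope = {b_fit:.2f}, intrinsic scatter = {s_fit:.2f}")
```

Setting the final element of each data tuple to a non-zero covariance is one way to check, as in Appendix C, how sensitive a fitted slope is to correlated measurement errors.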



APPENDIX C: ALLEN ET AL. 2008 
MASS-TEMPERATURE RELATION 

Allen et al. (2008) used Chandra X-ray Observatory data 
to measure hydrostatic masses at r_2500 for 42 hot (kT > 
4.5 keV), dynamically relaxed clusters at redshifts 0.05 < 
z < 1.1. As described in Section 4, mass estimates were 
obtained by fitting an NFW profile to the total mass 
distribution, using a non-parametric description of the ICM 
surface brightness and temperature. Average temperatures 
for some of these clusters, with a more current version of 
the Chandra calibration, were measured by Mantz et al. 
(2010a). We report here temperatures for many of the 
remaining clusters, from data reduced using exactly the same 
procedure as described in that paper. The temperatures are 
typically measured within larger radii than r_2500, but (being 
emission-weighted) the difference between these measurements 
and true kT_2500 values is at the few per cent 
level, typically smaller than the statistical error bars (e.g. 
Vikhlinin et al. 2009a; Mantz et al. 2010a).^8 These average 
temperatures are listed in Table C1, along with the best fitting 
values and 68.3 per cent confidence intervals for M_2500 
from Allen et al. (2008). We fit a power law to the data 
using the linmix_err algorithm of Kelly (2007), assuming 
independent, log-normal measurement errors.^9 The resulting 
best fit, shown in Figure C1, has a slope of 1.91 ± 0.19. 

8 Specifically, temperatures were fit from spectra in annuli 
between 100 kpc and the radius for each cluster where the 
signal-to-noise of the 0.8-7.0 keV surface brightness profile falls to 2, as 
described in Mantz et al. (2010a). These temperatures are a closer 
match spatially to r_2500, and are less prone to background 
modeling systematics, than the kT_500 values reported in that work. 

9 The assumption of independent error bars violates our own 
advice from Section 1. We have explicitly verified that measurement 
error correlations of ±0.9 produce negligible change to the best 
fitting power-law slope in this case. 

Table C1. Redshifts, masses at r_2500, and average temperatures for Allen et al. (2008) clusters. Mass constraints correspond to the NFW 
fits reported in Allen et al. (2008), assuming a spatially flat, cosmological constant model with Hubble parameter H_0 = 70 km s^-1 Mpc^-1 
and mean matter density with respect to the critical value Ω_m = 0.3. Values of the normalized Hubble parameter at each cluster's 
redshift, E(z) = H(z)/H_0 ∝ ρ_c(z)^(1/2), are given for this cosmology. Temperatures were derived as described in Mantz et al. (2010a), 
apart from Abell clusters 1795, 2029 and 478, whose temperatures are from Horner et al. (1999). To prevent the temperature data from 
becoming too heterogeneous, the five clusters which were not studied in either of these works were omitted from the analysis. (We have 
kept the Horner et al. temperatures for consistency with Mantz et al. 2010a, where they were also used; however, they do not influence 
the fit significantly.) 



Name                 z       E(z)    M_2500 (10^14 M_⊙)   kT (keV) 



Abell 1795           ...     ...     ...                   ... 
Abell 2204           0.152   1.076   ...                   ... 
MACSJ0429.6-0253     0.399   1.234   ...                   ... 
Abell 383            0.188   1.097   ...                   ... 
Abell 611            ...     ...     ...                   ... 
Abell 2537           ...     ...     ...                   ... 
MACSJ1423.8+2404     ...     1.339   ...                   ... 
MACSJ0744.9+3927     0.686   1.462   ...                   8.08 ... 
MACSJ0242.6-2132     ...     ...     ...                   ... 